All Questions
Tagged with linear-regression scikit-learn
90 questions
1 vote
0 answers
32 views
Predicting PGA Tour results with Linear Regression
I have curated a dataset from various online sources that contains information about each PGA player's weekly performance/trends. I'm attempting to predict their finishing positions at the next ...
0 votes
1 answer
37 views
Multivariate linear regression via scikit and statsmodels
I want to preface this first with terminology: multivariate regression deals with the case where there is more than one dependent variable, while multiple regression deals with the case where there is ...
2 votes
4 answers
192 views
My results from linear regression differ from my colleague's despite having the same data. Is this to be expected?
Long story short: The guy who did these calculations quit and did not leave any code behind. Now I am tasked with recreating the necessary calculations to perform this year's calculations - but my results ...
0 votes
0 answers
21 views
What's the difference between my OLS from scratch vs sklearn's OLS?
I'm coding linear regression via OLS from scratch. When I compare the results to scikit-learn's implementation, the coefficients in my version appear to be twice the magnitude of scikit-learn's. I'm ...
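One way to sanity-check a from-scratch OLS implementation is to compare it against an independent least-squares solver on the same data. The sketch below is only an illustration: the synthetic data and all variable names are assumptions, and it does not diagnose the factor-of-two discrepancy described in the excerpt.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): y = 1.5 + 2*x1 - 3*x2 + noise
X = rng.normal(size=(200, 2))
y = 1.5 + X @ np.array([2.0, -3.0]) + rng.normal(scale=0.1, size=200)

# Prepend a column of ones so the intercept is estimated as well.
X_design = np.column_stack([np.ones(len(X)), X])

# Normal equations: solve (X^T X) beta = X^T y.
beta_normal = np.linalg.solve(X_design.T @ X_design, X_design.T @ y)

# Reference solution from NumPy's least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X_design, y, rcond=None)

print(beta_normal)
print(np.allclose(beta_normal, beta_lstsq))
```

If the two solutions agree with each other but not with a third implementation, the difference most likely lies in how that implementation builds the design matrix or scales its loss.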
2 votes
1 answer
158 views
Linear regression with confidence interval
I am running a multivariate linear regression on noisy data, where the amount of error for each measurement is known (or at least estimated). It works reasonably well with weighted linear regression ...
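A minimal sketch of weighted least squares with known per-measurement errors, assuming independent Gaussian noise with standard deviations `sigma` (the data and all names here are hypothetical). Under that assumption the weights are $1/\sigma_i^2$ and the coefficient covariance is $(X^T W X)^{-1}$, which gives approximate confidence intervals for the coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data with a known standard deviation sigma_i per measurement.
n = 300
X = np.column_stack([np.ones(n), rng.uniform(0, 10, size=(n, 2))])
beta_true = np.array([1.0, 0.5, -2.0])
sigma = rng.uniform(0.1, 2.0, size=n)
y = X @ beta_true + rng.normal(scale=sigma)

# Weighted least squares with weights w_i = 1 / sigma_i^2.
# Dividing each row by sigma_i turns it into an ordinary least-squares problem.
Xw = X / sigma[:, None]
yw = y / sigma
beta_hat, *_ = np.linalg.lstsq(Xw, yw, rcond=None)

# With known sigma_i, the coefficient covariance is (X^T W X)^{-1}.
cov_beta = np.linalg.inv(Xw.T @ Xw)
std_err = np.sqrt(np.diag(cov_beta))

# Approximate 95% confidence interval per coefficient (normal quantile 1.96).
ci_lower = beta_hat - 1.96 * std_err
ci_upper = beta_hat + 1.96 * std_err
print(np.column_stack([ci_lower, beta_hat, ci_upper]))
```

If the errors are only estimated rather than known, these intervals are likely to be too narrow or too wide by roughly the same factor as the error estimates.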
1 vote
1 answer
57 views
Minimize $\sum_i||Y_i-AX_i||^2$
I have N data vectors $X_i$ and N target vectors $Y_i$, where $i$ indexes the sample. I would like to learn a linear map $A$ between the data and the target, i.e. find the matrix $A$ that minimizes $$\...
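As a sketch of one standard approach: if the $X_i$ are stacked as rows of a matrix `X` and the $Y_i$ as rows of `Y`, then $\sum_i||Y_i-AX_i||^2 = ||Y - XA^T||_F^2$, so $A^T$ is an ordinary least-squares solution. The dimensions, data, and names below are assumptions for illustration, and no intercept term is included.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical sizes: N samples, d input dimensions, k target dimensions.
N, d, k = 500, 4, 3
A_true = rng.normal(size=(k, d))
X = rng.normal(size=(N, d))                               # row i is X_i
Y = X @ A_true.T + rng.normal(scale=0.01, size=(N, k))    # row i is Y_i

# Solve X B = Y in the least-squares sense; B estimates A^T.
B, *_ = np.linalg.lstsq(X, Y, rcond=None)
A_hat = B.T

print(np.abs(A_hat - A_true).max())
```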
0 votes
1 answer
51 views
What can I do to address a regression with systematic bias towards the middle?
I’ve created a linear regression but my predicted output is usually too low for true high values and too high for true low values. I’ve tried introducing a pipeline where I use polynomial features, ...
0 votes
1 answer
953 views
Linear Regression line not showing in plot
It's a silly problem, I know, but it's getting on my nerves. Everything seems fine, but I cannot get the line to show on the plot. I've put it in a public Google notebook, for your convenience. t ...
0 votes
1 answer
4k views
ValueError: Found unknown categories ['IR', 'HN', 'MT', 'PH', 'NZ', 'CZ', 'MD'] in column 3 during transform
I am trying to use Linear Regression to predict salary in USD. I have the following data: Data: 607 records. Numerical columns: year, salary, salary in USD. Categorical columns: experience, type, ...
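An error of this form is typically raised when an encoder meets a category at transform time that was absent during fitting. The excerpt does not say which encoder was used, so the sketch below assumes scikit-learn's `OneHotEncoder`; the country codes are hypothetical. Setting `handle_unknown="ignore"` encodes unseen categories as an all-zero row instead of raising.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical country codes; "NZ" appears only at transform time.
train = np.array([["US"], ["GB"], ["DE"], ["US"]])
test = np.array([["GB"], ["NZ"]])

# handle_unknown="ignore" maps unseen categories to an all-zero row
# instead of raising ValueError during transform.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train)

# transform returns a sparse matrix by default; convert it for printing.
encoded = encoder.transform(test).toarray()
print(encoder.categories_)
print(encoded)
```

The trade-off is that every unseen category becomes indistinguishable from the others, so a model may predict poorly for rows dominated by new values.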
0 votes
1 answer
751 views
Feature scaling in Linear Regression
I always use the LinearRegression() class in the sklearn library to create a linear regression model. According to my understanding, we need feature scaling in linear ...
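A small check that may help here: for plain ordinary least squares with an intercept and no regularization, standardizing the features changes the coefficients but not the fitted values. The sketch uses NumPy rather than scikit-learn, and the data and the helper `fit_predict` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical features on very different scales.
X = np.column_stack([rng.normal(0, 1, 100), rng.normal(0, 1000, 100)])
y = 3.0 + 2.0 * X[:, 0] + 0.005 * X[:, 1] + rng.normal(scale=0.1, size=100)


def fit_predict(features, target):
    """Fit OLS with an intercept and return (coefficients, fitted values)."""
    design = np.column_stack([np.ones(len(features)), features])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return beta, design @ beta


# Standardize each column to zero mean and unit variance.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

beta_raw, pred_raw = fit_predict(X, y)
beta_scaled, pred_scaled = fit_predict(X_scaled, y)

print(np.allclose(pred_raw, pred_scaled))  # compares fitted values
print(beta_raw, beta_scaled)               # coefficients on each scale
```

This invariance does not carry over to regularized models such as ridge or lasso, or to solvers based on gradient descent, where feature scale does affect the result or the convergence.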
2 votes
1 answer
1k views
Dummy Variable trap in Linear Regression
The dummy variable trap is a common problem with linear regression when dealing with categorical variables, since one-hot encoding introduces redundancy. So if we have m categories in our categorical ...
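The redundancy can be made concrete with a small NumPy sketch (the category codes are hypothetical): with an intercept and all m dummy columns, the dummies sum to the intercept column, so the design matrix loses one rank. Dropping one dummy restores full column rank.

```python
import numpy as np

# Hypothetical categorical feature with m = 3 categories, coded 0, 1, 2.
codes = np.array([0, 1, 2, 0, 1, 2, 2, 0])
m = 3

one_hot = np.eye(m)[codes]              # full one-hot encoding, m columns
intercept = np.ones((len(codes), 1))

# Intercept plus all m dummies: the dummy columns sum to the intercept
# column, so the design matrix is rank deficient.
X_full = np.hstack([intercept, one_hot])
print(np.linalg.matrix_rank(X_full), X_full.shape[1])

# Dropping one dummy column removes the redundancy.
X_dropped = np.hstack([intercept, one_hot[:, 1:]])
print(np.linalg.matrix_rank(X_dropped), X_dropped.shape[1])
```

Here the full matrix has 4 columns but rank 3, while the reduced matrix has 3 columns and rank 3.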
0 votes
1 answer
1k views
What Equation is model.coef_ Derived From? (SKLearn)
Fairly simple question, but something I've been unable to understand firmly by scouring the interwebs. After running a LR model using SKlearn, one of the key outputs is ...
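For reference, and assuming "LR" here means ordinary least-squares linear regression (the excerpt is truncated, so this is a guess at what is being asked), the fitted coefficients minimize the residual sum of squares. With design matrix $X$, target vector $y$, and $X^T X$ invertible, the minimizer has the closed form

$$\hat{\beta} = \arg\min_{\beta} ||y - X\beta||^2 = (X^T X)^{-1} X^T y$$

In practice a library would typically solve this with a numerical least-squares routine rather than by forming the inverse explicitly, and the intercept is usually handled separately from the coefficient vector.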
0 votes
0 answers
48 views
How to Approach Linear Machine-Learning Model When Input Variables are Inconsistent
Disclaimer: I'm relatively new to the data science and ML world -- still trying to get a firm grasp on the fundamentals. I'm trying to overcome a regression challenge involving a large, multi-...
1 vote
0 answers
336 views
Multi Linear Regression on String Values
I'm using datasets that consist mostly of string values. The main outcome of the project is that it should predict success. Now I can use OneHotEncoding to convert string values into numerical format ...
0 votes
1 answer
220 views
scikit-learn: feature analysis differs heavily from model coefficients
I am trying to perform linear regression and I want to analyse the available features beforehand. The task is to predict the value of a house. Some of them might have a high impact on the label, ...